
CLAIMS 



1. A method for estimating the selectivity of queries in a relational database,
comprising the steps of: 

constructing a probabilistic relational model (PRM) from said database;

and 

performing online selectivity estimation for a particular query. 

2. The method of Claim 1, wherein said PRM is constructed automatically,
based solely on data and space allocated to said PRM.

3. The method of Claim 1, wherein said selectivity estimation step further
comprises the step of: 

said selectivity estimator receiving as inputs both said query and said 
PRM, and outputting an estimate for a result size of said query. 

4. The method of Claim 1, wherein the same PRM is used to estimate the size
of a query over any subset of attributes in said database; and wherein prior
information about a query workload is not required. 



5. The method of Claim 1, wherein selectivity estimation is performed for 
select queries over a single table; and wherein a Bayesian network is used to 
approximate a joint distribution over an entire set of attributes in said table.

6. The method of Claim 1, wherein selectivity estimation is performed for
queries over multiple tables; and wherein one or more PRMs are used to
accomplish both select and join selectivity estimation in a single framework.

7. The method of Claim 1, further comprising the step of:

learning PRMs with link uncertainty with a heuristic search algorithm. 

8. The method of Claim 7, wherein said search algorithm comprises a greedy
hill-climbing search, using random restarts to escape local maxima. 

9. A method for learning probabilistic relational models (PRMs) having
attribute uncertainty, comprising the steps of:

providing a parameter estimation task by:

inputting a relational schema that specifies a set of classes, having
attributes associated with said classes, and having relationships between
objects in different classes;

providing a fully specified instance of said schema in the form of a
training database; and

performing a structure learning task to extract an entire PRM solely
from said training database.

10. The method of Claim 9, said structure learning task comprising the step of
specifying which structures are candidate hypotheses. 

11. The method of Claim 10, said structure learning task comprising the step
of evaluating different candidate hypotheses relative to input data.

12. The method of Claim 11, said structure learning task comprising the step
of searching hypothesis space for a structure having a high score.

13. A method for learning probabilistic relational models having link
uncertainty, comprising the steps of:

providing a mechanism for modeling link uncertainty; and 
said mechanism computing sufficient statistics that include existence 
attributes without adding all nonexistent entities into a database. 

14. The method of Claim 10, said mechanism comprising:

let $\mu$ be a particular instantiation of $\mathrm{Pa}(X.E)$;

to compute $C_{X.E}[\mathit{true}, \mu]$, use a standard database query to compute
how many objects $x \in O^{\sigma}(X)$ have $\mathrm{Pa}(x.E) = \mu$;

to compute $C_{X.E}[\mathit{false}, \mu]$, compute the number of potential entities
without explicitly considering each
$(x_1, \ldots, x_k) \in O^{\sigma}(Y_1) \times \cdots \times O^{\sigma}(Y_k)$ by
decomposing the computation as follows:

let $\rho$ be a reference slot of $X$ with $\mathrm{Range}[\rho] = Y$;

let $\mathrm{Pa}_{\rho}(X.E)$ be the subset of parents of $X.E$ along slot $\rho$; and

let $\mu_{\rho}$ be a corresponding instantiation;

count a number of $y$ consistent with $\mu_{\rho}$;

if $\mathrm{Pa}_{\rho}(X.E)$ is empty, this count is $|O^{\sigma}(Y)|$;

wherein the product of these counts is the number of potential entities;
to compute $C_{X.E}[\mathit{false}, \mu]$, subtract $C_{X.E}[\mathit{true}, \mu]$ from said number.

15. A method for learning probabilistic relational models having link
uncertainty, comprising the steps of: 

providing a mechanism for modeling link uncertainty; and
said mechanism computing sufficient statistics that include reference
uncertainty, comprising the steps of:

fixing a set of partition attributes $\Psi[\rho]$; and
treating a variable $S_{\rho}$ as any other attribute in a PRM;
wherein scoring success in predicting a value of said attribute given a 
value of its parents is performed using standard Bayesian methods. 
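
Illustrative sketch (not part of the claims): the counting decomposition recited in Claim 14 can be pictured with a small Python example. Everything named here is hypothetical and chosen only for illustration: the functions `count_consistent` and `existence_counts`, the in-memory dictionaries standing in for database tables, and the actor/movie/role data. The sketch assumes each parent of X.E is an attribute of an object reached through exactly one reference slot, and it replaces the "standard database query" with a plain Python scan.

```python
from math import prod


def count_consistent(objects, mu_rho):
    """Count objects y whose attributes agree with the partial instantiation mu_rho.

    An empty mu_rho matches every object, so the count is the size of the class.
    """
    return sum(all(y[a] == v for a, v in mu_rho.items()) for y in objects)


def existence_counts(links, ranges, mu):
    """Return (C[true, mu], C[false, mu]) for the existence attribute X.E.

    links  : list of dicts, slot name -> key of the referenced object
             (one dict per existing object of the relationship class X)
    ranges : dict, slot name -> {key: attribute dict} for the slot's range class
    mu     : dict, slot name -> {attribute: value} (parents of X.E along that slot)
    """
    def matches(link):
        return all(
            ranges[slot][link[slot]][attr] == value
            for slot, mu_rho in mu.items()
            for attr, value in mu_rho.items()
        )

    # C[true, mu]: existing links whose parent values equal mu.
    c_true = sum(matches(link) for link in links)
    # Potential entities: product of per-slot counts, with no enumeration
    # of the full cross product of the range classes.
    potential = prod(
        count_consistent(ranges[slot].values(), mu.get(slot, {}))
        for slot in ranges
    )
    # C[false, mu]: potential entities that do not exist.
    return c_true, potential - c_true


# Hypothetical toy data: a Role relationship between Actor and Movie.
actors = {"a1": {"gender": "F"}, "a2": {"gender": "M"}, "a3": {"gender": "F"}}
movies = {"m1": {"genre": "drama"}, "m2": {"genre": "comedy"}}
roles = [
    {"actor": "a1", "movie": "m1"},
    {"actor": "a2", "movie": "m1"},
    {"actor": "a3", "movie": "m2"},
]
ranges = {"actor": actors, "movie": movies}
mu = {"actor": {"gender": "F"}, "movie": {"genre": "drama"}}
c_true, c_false = existence_counts(roles, ranges, mu)
```

In this toy data, two actors and one movie are consistent with the chosen instantiation, so there are two potential entities; one of them exists as a role, leaving one nonexistent entity, and no nonexistent pair is ever materialized.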